Carefully compare your experimental settings with those described in previous papers. The experimental settings include the training/testing split, input format (e.g., image size), evaluation metric, backbone network, etc. If they are not exactly the same, you need to re-run the previous methods with exactly the same experimental settings as yours for a fair comparison. For example, if the baseline uses a 32x32 image size while you use 224x224, that is an unfair comparison. If the baseline uses AlexNet as the backbone while you use ResNet, that is also an unfair comparison.
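As a concrete illustration, here is a minimal Python sketch of keeping the settings identical when re-running a baseline. The config fields and the `build_model` / `evaluate` helpers are hypothetical placeholders, not any particular codebase's API; the point is only that both methods read from one shared config.

```python
# Minimal sketch (hypothetical helpers): re-run the baseline under exactly
# the same experimental settings as your own method.
from dataclasses import dataclass


@dataclass
class ExperimentConfig:
    train_split: str = "train"        # same training/testing split for both
    image_size: int = 224             # same input resolution for both
    backbone: str = "resnet50"        # same backbone network for both
    metric: str = "top1_accuracy"     # same evaluation metric for both


def compare(build_model, evaluate, cfg: ExperimentConfig):
    """Evaluate the baseline and your method under one shared config."""
    results = {}
    for name in ("baseline", "ours"):
        model = build_model(name, backbone=cfg.backbone)
        results[name] = evaluate(
            model,
            split=cfg.train_split,
            image_size=cfg.image_size,
            metric=cfg.metric,
        )
    return results
```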
The baseline methods may use more or less information than your method. Depending on whether the baseline uses more or less information, and whether its performance is better or worse than yours, there are the following four cases.
- If the baseline uses more information but achieves worse results, never mind. Just brag about your own method.
- If the baseline uses less information but achieves better results, it is a serious problem. You have to check your method.
- If the baseline uses less information and achieves worse results, that is reasonable. But you may be accused of an unfair comparison because your method uses more information. To be safe, augment the baseline with the extra information and compare again.
- If the baseline uses more information and achieves better results, that is also reasonable. But do not casually put such numbers in the paper, because dumb reviewers may ignore these details and claim that your method is not good enough. That is very risky! Please remove the extra information from the baseline and compare again (see the sketch after this list).
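To make these follow-up comparisons concrete, here is a small Python sketch of equalizing the information budget before comparing: both methods are run with and without the extra information. `train_and_eval` and its `use_extra_info` flag are hypothetical placeholders for your own training/evaluation pipeline.

```python
# Minimal sketch (hypothetical helpers): compare the baseline and your method
# under matched information budgets, both with and without the extra input.
def fair_comparison(train_and_eval):
    rows = []
    for use_extra in (False, True):
        # Give (or deny) the extra information to BOTH methods, so each row
        # of the comparison uses the same inputs.
        baseline_score = train_and_eval("baseline", use_extra_info=use_extra)
        ours_score = train_and_eval("ours", use_extra_info=use_extra)
        rows.append((use_extra, baseline_score, ours_score))
    return rows
```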